An efficient algorithm for parallel distributed unsupervised learning
نویسندگان
چکیده
This paper presents a new technique for parallel and distributed unsupervised learning. It arises from a detailed analysis of the weaknesses of several, existing algorithms, the most important of which is the presence of intrinsically serial operations. The basic idea of this work, therefore, is the substitution of this latter with new operations, as similar as possible to the original ones, but better suited to a parallel implementation. The result is a notable increase in speed-up; the price to be paid is a slight deterioration in the precision of the clustering process. The ideal applications for the new algorithm are very complex problems with a high number of patterns and classes. r 2007 Elsevier B.V. All rights reserved.
منابع مشابه
Static Task Allocation in Distributed Systems Using Parallel Genetic Algorithm
Over the past two decades, PC speeds have increased from a few instructions per second to several million instructions per second. The tremendous speed of today's networks as well as the increasing need for high-performance systems has made researchers interested in parallel and distributed computing. The rapid growth of distributed systems has led to a variety of problems. Task allocation is a...
متن کاملA Hybrid Unconscious Search Algorithm for Mixed-model Assembly Line Balancing Problem with SDST, Parallel Workstation and Learning Effect
Due to the variety of products, simultaneous production of different models has an important role in production systems. Moreover, considering the realistic constraints in designing production lines attracted a lot of attentions in recent researches. Since the assembly line balancing problem is NP-hard, efficient methods are needed to solve this kind of problems. In this study, a new hybrid met...
متن کاملParallelized Unsupervised Feature Selection for Large-Scale Network Traffic Analysis
In certain domains, where model interpretability is highly valued, feature selection is often the only possible option for dimensionality reduction. However, two key problems arise. First, the size of data sets today makes it unfeasible to run centralized feature selection algorithms in reasonable amounts of time. Second, the impossibility of labeling data sets rules out supervised techniques. ...
متن کاملHigh-Dimensional Unsupervised Active Learning Method
In this work, a hierarchical ensemble of projected clustering algorithm for high-dimensional data is proposed. The basic concept of the algorithm is based on the active learning method (ALM) which is a fuzzy learning scheme, inspired by some behavioral features of human brain functionality. High-dimensional unsupervised active learning method (HUALM) is a clustering algorithm which blurs the da...
متن کاملTux2: Distributed Graph Computation for Machine Learning
TUX2 is a new distributed graph engine that bridges graph computation and distributed machine learning. TUX2 inherits the benefits of an elegant graph computation model, efficient graph layout, and balanced parallelism to scale to billion-edge graphs; we extend and optimize it for distributed machine learning to support heterogeneity, a Stale Synchronous Parallel model, and a new MEGA (Mini-bat...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- Neurocomputing
دوره 71 شماره
صفحات -
تاریخ انتشار 2008